6 research outputs found

    PQ-HPKE: Post-Quantum Hybrid Public Key Encryption

    Get PDF
    Public key cryptography is used to asymmetrically establish keys, authenticate or encrypt data between communicating parties at a relatively high performance cost. To reduce computational overhead, modern network protocols combine asymmetric primitives for key establishment and authentication with symmetric ones. Similarly, Hybrid Public Key Encryption, a relatively new scheme, uses public key cryptography for key derivation and symmetric key cryptography for data encryption. In this paper, we present the first quantum-resistant implementation of HPKE to address concerns that quantum computers bring to asymmetric algorithms. We propose PQ-only and PQ-hybrid HPKE variants and analyze their performance for two post-quantum key encapsulation mechanisms and various plaintext sizes. We compare these variants with RSA and classical HPKE and show that the additional post-quantum overhead is amortized over the plaintext size. Our PQ-hybrid variant with a lattice-based KEM shows an overhead of 52% for 1KB of encrypted data which is reduced to 17% for 1MB of plaintext. We report 1.83, 1.78, and 2.15 x10^6 clock cycles needed for encrypting 1MB of message based on classical, PQ-only, and PQ-hybrid HPKE respectively, where we note that the cost of introducing quantum-resistance to HPKE is relatively low

    Supersingular Isogeny Key Encapsulation (SIKE) Round 2 on ARM Cortex-M4

    Get PDF
    We present the first practical software implementation of Supersingular Isogeny Key Encapsulation (SIKE) round 2, targeting NIST\u27s 1, 2, 3, and 5 security levels on 32-bit ARM Cortex-M4 microcontrollers. The proposed library introduces a new speed record of all SIKE Round 2 protocols with reasonable memory consumption on the low-end target platforms. We achieved this record by adopting several state-of-the-art engineering techniques as well as highly-optimized hand-crafted assembly implementation of finite field arithmetic. In particular, we carefully redesign the previous optimized implementations of finite field arithmetic on the 32-bit ARM Cortex-M4 platform and propose a set of novel techniques which are explicitly suitable for SIKE primes. The benchmark result on STM32F4 Discovery board equipped with 32-bit ARM Cortex-M4 microcontrollers shows that entire key encapsulation and decapsultation over SIKEp434 take about 184 million clock cycles (i.e. 1.09 seconds @168MHz). In contrast to the previous optimized implementation of the isogeny-based key exchange on low-end 32-bit ARM Cortex-M4, our performance evaluation shows feasibility of using SIKE mechanism on the target platform. In comparison to the most of the post-quantum candidates, SIKE requires an excessive number of arithmetic operations, resulting in significantly slower timings. However, its small key size makes this scheme as a promising candidate on low-end microcontrollers in the quantum era by ensuring the lower energy consumption for key transmission than other schemes

    Time-Efficient Finite Field Microarchitecture Design for Curve448 and Ed448 on Cortex-M4

    Get PDF
    The elliptic curve family of schemes has the lowest computational latency, memory use, energy consumption, and bandwidth requirements, making it the most preferred public key method for adoption into network protocols. Being suitable for embedded devices and applicable for key exchange and authentication, ECC is assuming a prominent position in the field of IoT cryptography. The attractive properties of the relatively new curve Curve448 contribute to its inclusion in the TLS1.3 protocol and pique the interest of academics and engineers aiming at studying and optimizing the schemes. When addressing low-end IoT devices, however, the literature indicates little work on these curves. In this paper, we present an efficient design for both protocols based on Montgomery curve Curve448 and its birationally equivalent Edwards curve Ed448 used for key agreement and digital signature algorithm, specifically the X448 function and the Ed448 DSA, relying on efficient low-level arithmetic operations targeting the ARM-based Cortex-M4 platform. Our design performs point multiplication, the base of the Elliptic Curve Diffie-Hellman (ECDH), in 3,2KCCs, resulting in more than 48% improvement compared to the best previous work based on Curve448, and performs sign and verify, the main operations of the Edwards-curves Digital Signature Algorithm (EdDSA), in 6,038KCCs and 7,404KCCs, showing a speedup of around 11% compared to the counterparts. We present novel modular multiplication and squaring architectures reaching ~25% and ~35% faster runtime than the previous best-reported results, respectively, based on Curve448 key exchange counterparts, and ~13% and ~25% better latency results than the Ed448-based digital signature counterparts targeting Cortex-M4 platform

    Compressed SIKE Round 3 on ARM Cortex-M4

    Get PDF
    In 2016, the National Institute of Standards and Technology (NIST) initiated a standardization process among the post-quantum secure algorithms. Forming part of the alternate group of candidates after Round 2 of the process is the Supersingular Isogeny Key Encapsulation (SIKE) mechanism which attracts with the smallest key sizes offering post-quantum security in scenarios of limited bandwidth and memory resources. Even further reduction of the exchanged information is offered by the compression mechanism, proposed by Azarderakhsh et al., which, however, introduces a significant time overhead and increases the memory requirements of the protocol, making it challenging to integrate it into an embedded system. In this paper, we propose the first compressed SIKE implementation for a resource-constrained device, where we targeted the NIST recommended platform STM32F407VG featuring ARM Cortex-M4 processor. We integrate the isogeny-based implementation strategies described previously in the literature into the compressed version of SIKE. Additionally, we propose a new assembly design for the finite field operations particular for the compressed SIKE, and observe a speedup of up to 16% and up to 25% compared to the last best-reported assembly implementations for p434, p503, and p610

    PUF-Kyber: Design of a PUF-Based Kyber Architecture Benchmarked on Diverse ARM Processors

    No full text
    In this paper, through using physical unclonable functions (PUF) and true random number generators (TRNG), we improve the overall security of CRYSTALS-Kyber and provide physical security to it. Our implementation results on ARMv7 and ARMv8 architectures indicate significant speedup, compared to the reference work.</p

    PUF-Dilithium: Design of a PUF-Based Dilithium Architecture Benchmarked on ARM Processors

    No full text
    In this paper, by taking advantage of physical unclonable functions (PUFs), we introduce a novel design that provides physical security to CRYSTALS-Dilithium. After discussing the advantages of our design compared to the original design, we implemented it on two different architectures, ARMv7 and ARMv8. Our findings demonstrate substantial enhancements in both security and performance over the reference work.</p
    corecore